fix(routing): prevent QueryEvent publish races #3490

Open
lidel wants to merge 2 commits into master from fix/publish-query-event-race

@lidel (Member) commented Apr 19, 2026

The bug

PublishQueryEvent forwards QueryEvent.Responses to subscribers by pointer. A publisher that keeps mutating its []*peer.AddrInfo (or any AddrInfo.Addrs) after the call returns races with every subscriber reading the event.

This is the canonical shape in routing systems, for example DHT implementations: a worker aggregates "closer peers" into a slice, publishes it as a PeerResponse event, then continues processing the same slice (e.g. enriching Addrs from a peerstore).
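
To make the aliasing concrete, here is a minimal single-goroutine sketch. It does not use the real go-libp2p types: AddrInfo, QueryEvent and publishByPointer are simplified, hypothetical stand-ins (addresses are plain strings), and delivery is modeled with a buffered channel. It only shows that the subscriber ends up looking at the publisher's memory; with a concurrent reader the same pattern is a data race.

```go
package main

import "fmt"

// AddrInfo is a simplified stand-in for peer.AddrInfo; real addresses are
// multiaddr values, not strings.
type AddrInfo struct {
	ID    string
	Addrs []string
}

// QueryEvent is a simplified stand-in for routing.QueryEvent.
type QueryEvent struct {
	Responses []*AddrInfo
}

// publishByPointer models the pre-fix behavior (hypothetical helper): the
// subscriber receives the same *AddrInfo values the publisher still holds.
func publishByPointer(ch chan<- *QueryEvent, ev *QueryEvent) {
	ch <- ev
}

func main() {
	ch := make(chan *QueryEvent, 1)

	closer := []*AddrInfo{{ID: "peerA", Addrs: []string{"/ip4/192.0.2.1/tcp/4001"}}}
	publishByPointer(ch, &QueryEvent{Responses: closer})

	// The publisher keeps working on its slice after publishing,
	// e.g. enriching Addrs from a peerstore.
	closer[0].Addrs = append(closer[0].Addrs, "/ip4/192.0.2.2/tcp/4001")

	// The subscriber sees the later mutation because the memory is shared.
	got := <-ch
	fmt.Println(len(got.Responses[0].Addrs)) // prints 2
}
```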

This class of race was reported in ipfs/kubo#11287 and ipfs/kubo#11116.

Proof

To make review easier, the first commit (test: PublishQueryEvent Responses race) adds a minimal test that fails under -race. CI caught it on go-test before the fix:

https://github.com/libp2p/go-libp2p/actions/runs/24641378021/job/72045816212?pr=3490#step:19:580

The fix

Note

This is a general fix, not a kad-dht-specific one. Any current or future routing/DHT implementation that emits QueryEvents can hit the same race; centralizing the copy in core/routing covers all of them at once. I feel this is the right call: it avoids similar bugs resurfacing every few months when someone moves code around, or when a Go change makes the race more likely.

PublishQueryEvent now deep-copies Responses (and each AddrInfo.Addrs) before handing the event to subscribers. Callers get a strengthened contract: mutate your own copy freely after publishing.

  • Zero cost when nobody subscribes (the existing ctx.Value early return is untouched)
  • Multiaddr values stay shared; they are by-convention immutable
  • No API break

lidel added 2 commits April 20, 2026 01:05
Clone Responses (and each AddrInfo.Addrs) before delivery so a
publisher can safely mutate its slice after the call returns.
@lidel changed the title from "test: PublishQueryEvent Responses race" to "fix(routing): prevent QueryEvent publish races" Apr 19, 2026
@lidel marked this pull request as ready for review April 19, 2026 23:36